clips/interpret_with_rules

This project induces rules to explain the predictions of a trained neural network and, optionally, the patterns that the model captures from the training data as well as the patterns present in the original dataset. The code corresponds to the paper:

Rule induction for global explanation of trained models
Madhumita Sushil, Simon Šuster and Walter Daelemans
Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), EMNLP 2018

The required packages are listed in requirements.txt; these dependencies must be installed before using the code.
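
A typical way to install them, assuming a standard Python 3 environment with pip available, is:

pip3 install -r requirements.txt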

The code uses Python 3 and can be run as follows:

To train a neural document classifier (on the science categories of the 20 Newsgroups dataset) and then induce rules that explain the classifier's predictions, run:

python3 main.py -r gradient -loadmodel False -m <modelname.tar>

To induce rules that explain an already trained network, set the loadmodel option to True and pass the name of the pretrained model. To replicate the exact results of the paper, the model explained there is provided as nn-model.tar.
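
For example, a sketch of the corresponding invocation, reusing the flags shown above with the provided model, would be:

python3 main.py -r gradient -loadmodel True -m nn-model.tar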

To induce rules that identify the patterns in the original training data, run:

python3 main.py -r trainset -loadmodel False -m

A complete description of the options can be obtained with the --help option:

python3 main.py --help
